Abstract

Lucid programs are data-flow programs and can be visually represented as data flow
graphs (DFGs) and composed visually. Forensic Lucid, a Lucid dialect, is a language to
specify and reason about cyberforensic cases. It includes the encoding of the evidence
(representing the context of evaluation) and the crime scene modeling in order to validate
claims against the model and perform event reconstruction, potentially within large swaths
of digital evidence. To help investigators model the scene and evaluate it without typing a
Forensic Lucid program, we propose to extend the design and implementation of the Lucid
DFG programming to Forensic Lucid case modeling and specification, to enhance the
usability of the language, the system, and its behavior. We briefly discuss the related work
on visual programming and DFG modeling in an attempt to define and select one approach
or a composition of approaches for Forensic Lucid based on criteria such as previous
implementation, wide use, and formal backing in terms of semantics and translation. In the
end, we solicit the readers' constructive opinions, feedback, comments, and recommendations
within the context of this short discussion.
1 Overview

Cyberforensic analysis has to do with automated or semi-automated processing of, and reasoning
about, digital evidence, witness accounts, and other details from cybercrime incidents (involving
computers, but not limited to them). Analysis is one of the phases in cybercrime investigation;
the phases that precede it focus on evidence collection, preservation, chain of custody, and
information extraction. The phases that follow the analysis are formulation of
a report and potential prosecution, typically involving expert witnesses. Quite a
few techniques, tools (hardware and software), and methodologies have been developed for the
mentioned phases of the process. A lot of attention has been paid to tool development for
evidence collection and preservation; a few tools have been developed to aid data "browsing" on
the confiscated storage media, log files, memory, and so on. Far fewer tools have been
developed for case analysis of the data (e.g. Sleuthkit), and the existing commercial packages
(e.g. Encase or FTK) are very expensive. Even fewer tools support case management, event
modeling, and event reconstruction, especially with a solid formal theoretical base. The first
formal approach to cybercrime investigation was the finite-state automata (FSA) approach
by Gladyshev et al. [9, 8]. Their approach, however, is unduly complex to use and to understand
for investigators without a background in theoretical computer science or an equivalent.






The Need of DFGs for Forensic Lucid Programs in GIPSY 



Mokhov, Paquet, Debbabi 



The aim of Forensic Lucid is to alleviate those difficulties, be sound and complete, expressive
and usable, and provide further usability improvements through a GUI for data-flow graph
(DFG) based programming that allows translation between DFGs and the Forensic Lucid code
for compilation and evaluation. In previous related work, a similar solution for Indexical Lucid
was already implemented in the General Intensional Programming System (GIPSY) [5], but
it requires additional forensic and imperative extensions.

The goal of Forensic Lucid in cyberforensic analysis is to express in program
form the encoding of the evidence, witness stories, and evidential statements, which can be tested
against claims to see if there is a possible sequence, or multiple sequences, of events that explain
a given story. As with Gladyshev's FSA, it is designed to help investigators avoid ad hoc
conclusions by having them look at the possible explanations that the Forensic Lucid program
"execution" would yield and refine the investigation accordingly, as was shown in the works by
Gladyshev et al. [9, 8], where hypothetical investigators failed to analyze all the "stories" and
their plausibility before drawing conclusions.

Figure 1 [22] gives a general design overview of the Forensic Lucid compilation and evaluation
process involving various components and systems. Of main interest to this work are the inputs
to the compiler: the Forensic Lucid fragments (hierarchical context representing the evidence
and witness accounts) and programs (descriptions of the crime scenes as transition functions)
can come from different sources, including the visual interactive DFG editor that would be used
by the investigators, shown at the top-right corner of the image. Once the complete evidential
knowledge of the case and the crime scene model are composed, the whole specification is compiled
by the compiler depicted as GIPC in the image (General Intensional Program Compiler). The
compiler produces an intermediate version of the compiled program as an AST and a contextual
dictionary of all identifiers, among other things, that the evaluation engines (under the GEE
component) understand. The proposed Forensic Lucid engines are designed to use the traditional
eduction, AspectJ-based tracing, and probabilistic model checking with PRISM.

2 Related Work 

There are a number of items and proposals in graph-based visualization and the corresponding 
languages. 

In GIPSY, our own work in the area includes the theoretical foundation and initial practical
implementation of the DFGs [25, 5]. First, Faustini proved that any Indexical Lucid program
can be represented as a DFG [7]; Paquet subsequently expanded on this for multidimensional
intensional programs, as shown e.g. in Figure 2 [25]. Ding then realized Paquet's notion to a
good extent within the GIPSY project [5] in 2004, using the lefty GUI of Graphviz [3] and the dot
language [2] along with bi-directional translation between the abstract syntax trees (ASTs) of
GIPL or Indexical Lucid and those of dot.
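
To make the AST-to-dot direction of such a translation concrete, the following is a minimal
Python sketch. It is not Ding's implementation: the Node class and the ast_to_dot function are
hypothetical names introduced only for illustration, and the only thing taken from the cited
tools is the digraph syntax of the dot language [2]. Labels are assumed not to contain double
quotes.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Node:
    """Hypothetical AST node: an operator or identifier with ordered operands."""
    label: str
    children: list["Node"] = field(default_factory=list)


def ast_to_dot(root: Node, name: str = "dfg") -> str:
    """Emit a dot digraph whose edges run from each operand to the operator using it."""
    ids = count()
    lines = [f"digraph {name} {{"]

    def visit(node: Node) -> str:
        node_id = f"n{next(ids)}"
        lines.append(f'  {node_id} [label="{node.label}"];')
        for child in node.children:
            child_id = visit(child)
            # Data flows from the operand into the operator node.
            lines.append(f"  {child_id} -> {node_id};")
        return node_id

    visit(root)
    lines.append("}")
    return "\n".join(lines)


# The expression N + 1, a fragment of the kind found in simple Lucid programs.
example = Node("+", [Node("N"), Node("1")])
dot_text = ast_to_dot(example)
print(dot_text)
```

The reverse direction (dot back to an AST) would need a dot parser and is deliberately left out
of this sketch.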

Additionally, part of the proposed related work on visualization and control of communication
patterns and load balancing was to have a "3D editor" within RIPE's DemandMonitor
that would render in 3D space the current communication patterns of a GIPSY program in
execution, or replay them, and allow the user to visually redistribute demands if they go off
balance between workers. This would be a kind of virtual 3D remote control with a mini expert
system, whose input can be used to teach the planning, caching, and load-balancing algorithms to
perform efficiently the next time a similar GIPSY application is run, as was proposed in [15].
Related work by several researchers on visualization of load balancing, configuration, formal
systems for diagrammatic modeling, and visual languages and the corresponding graph systems is
presented in [30, 28, 1, 3, 14, 23]. They all define some key concepts that are relevant to our
visualization mechanisms within GIPSY and its corresponding General Manager Tier (GMT) [12].

Figure 1: Forensic Lucid Compilation and Evaluation Flow in GIPSY

We propose to build upon those works to represent the nested evidence and the crime scene as a
2D or even 3D DFG, along with the flow of reconstructed events upon evaluation. Such a feature is
projected for the near future to support the previous work on intensional forensic computing,
evidence modeling and encoding, Forensic Lucid [22, 18, 20, 21], and MARFL [17, 16] (where
the intensional hybrid programming languages are being realized within the GIPSY platform
to investigate the languages' properties and test their run-time aspects) in order to aid
investigators in building and evaluating digital forensic cases.



Examples 

For that related work, a conceptual example of a 2D DFG corresponding to a simple Lucid
program is in Figure 2. The actual current rendering of such graphs is exemplified in Figure 3
from Ding [5] in the GIPSY environment.

Figure 7 shows the conceptual hierarchical nesting of the evidential statement es context
elements, such as observation sequences os and their individual observations o (consisting of the
properties being observed, (P, min, max, w, t), details of which are discussed in the referenced
related works). These conceptual visualizations are proposed to be renderable in
2D or in 3D via an interactive interface to allow modeling complex crime scenes and
multidimensional evidence on demand. The end result could look like something that expands or
"cuts out" nodes or complex-type results, as conceptually exemplified in Figure 4.
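
The nesting of es, os, and o described here can be sketched as a small data structure. The
following Python fragment is only an illustration under stated assumptions: the class names, the
field names, and the sample case data are all hypothetical, and the exact meaning of the
(P, min, max, w, t) components is left to the referenced works rather than asserted here.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """One observation o = (P, min, max, w, t); semantics are defined in the cited works."""
    prop: str    # P, the property being observed
    min_: int    # min
    max_: int    # max
    w: float     # w
    t: int       # t


@dataclass
class ObservationSequence:
    """An observation sequence os: an ordered list of observations."""
    name: str
    observations: list[Observation] = field(default_factory=list)


@dataclass
class EvidentialStatement:
    """An evidential statement es: a collection of observation sequences."""
    name: str
    sequences: list[ObservationSequence] = field(default_factory=list)


def outline(es: EvidentialStatement) -> list[str]:
    """Flatten the es > os > o nesting into indented lines, one per context element."""
    lines = [f"es {es.name}"]
    for seq in es.sequences:
        lines.append(f"  os {seq.name}")
        for o in seq.observations:
            lines.append(f"    o ({o.prop}, {o.min_}, {o.max_}, {o.w}, {o.t})")
    return lines


# Hypothetical case data, invented purely for illustration.
case = EvidentialStatement("case", [
    ObservationSequence("witness", [Observation("printer_idle", 1, 0, 1.0, 0)]),
    ObservationSequence("log", [
        Observation("job_added", 1, 0, 1.0, 0),
        Observation("job_deleted", 1, 0, 1.0, 0),
    ]),
])
print("\n".join(outline(case)))
```

An interactive 2D or 3D editor would walk the same structure, drawing one container per level
instead of one indented line.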







Figure 2: Canonical Example of a 2D Data Flow Graph-Based Program

Figure 3: Example of an Actual Rendered 2D Data Flow Graph-Based Program with Graphviz

Figure 4: Modified Example of a 2D Data Flow Graph-Based Program with 3D Elements. Cutout
image credit is that of Europa found on Wikipedia
http://en.wikipedia.org/wiki/File:PIA01130_Interior_of_Europa.jpg from NASA

Figure 5: Conceptual Example of a 3D Observation Node. Cutout image credit is that of Europa
found on Wikipedia http://en.wikipedia.org/wiki/File:PIA01130_Interior_of_Europa.jpg
from NASA

Figure 6: Example of a BPEL Graph with Asynchronous Flows

Figure 7: Nested Context Hierarchy Example for Cyberforensic Investigation [18]



3 Visualization of Forensic Lucid 

3.1 3 Dimensions? 

The need to represent forensic cases, evidence, and other specification components visually
is evident for usability and other reasons. Placing the "program" (specification) and the case
in 3D space can help arrange and structure the case better in a virtual
environment, with the evidence items encapsulated in 3D balls like Russian dolls that can
be navigated in depth to any level of detail via clicking (cf. Figure 5).

The depth and complexity of the operational semantics and the demand-driven (eductive) execution
model are better represented and comprehended visually in 3D, especially when doing event
reconstruction. Ding's implementation allows navigation from graph to graph by expanding
more complex nodes to their definitions, e.g. more elaborate operators such as whenever (wvr) or
advances upon (upon).
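
For readers less familiar with the two operators named above, the following Python sketch
mimics their usual stream behavior with lazy generators. It is an illustration only: it models
neither the GIPSY eductive engine nor Ding's node expansion, and it works on ordinary Python
iterables, whereas Lucid streams are conceptually infinite.

```python
from itertools import count, cycle, islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def wvr(xs: Iterable[T], ps: Iterable[bool]) -> Iterator[T]:
    """X wvr P: keep the values of X at the positions where P is true."""
    for x, p in zip(xs, ps):
        if p:
            yield x


def upon(xs: Iterable[T], ps: Iterable[bool]) -> Iterator[T]:
    """X upon P: repeat the current value of X, advancing only after P was true."""
    it = iter(xs)
    try:
        current = next(it)
    except StopIteration:
        return  # empty X yields an empty stream
    for p in ps:
        yield current
        if p:
            try:
                current = next(it)
            except StopIteration:
                return  # X ran out; a finite-input artifact of this sketch


xs = [1, 2, 3, 4, 5]
ps = [True, False, True, True, False]
print(list(wvr(xs, ps)))   # [1, 3, 4]
print(list(upon(xs, ps)))  # [1, 2, 2, 3, 4]

# Values are produced only when asked for, so unbounded inputs are fine.
print(list(islice(upon(count(), cycle([False, True])), 6)))  # [0, 0, 1, 1, 2, 2]
```

The generators compute a value only when a consumer requests it, which is a loose analogy to
demand-driven evaluation, not a model of it.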

3.2 Requirements Summary 

Some immediate requirements to realize the envisioned DFG visualization of Forensic Lucid 
programs and their evaluation: 

• Visualization of the hierarchical evidential statements (potentially deeply nested context),
cf. Figure 7.

• Placement of hybrid intensional-imperative nodes into the DFGs, such as when mixing Java and
Lucid program fragments. The GIPSY research and development group's previous works
did not deal with how to augment the DFGAnalyzer and DFGGenerator of Ding
to support hybrid GIPSY programs. This can be addressed by adding an unexpandable
imperative DFG node to the graph. To be more useful, such a node should be expandable so that
it is possible to generate the GIPSY code from it or reverse it back. The newer versions of
Graphviz also have new features that are more usable for our present needs.
Additionally, with the advent of JOOIP [29], the Java 5 ASTs are made available
along with embedded Lucid fragments, which can be tapped into when generating the dot
code's AST.

• Java-based wrapper for the DFG Editor of Yimin Ding [5] to enable its native use within
Java-based GIPSY and plug-in IDE environments like Eclipse.
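
The second requirement above, an unexpandable imperative node placed among intensional nodes,
could look roughly as follows when emitted as dot text. This Python sketch is an assumption-laden
illustration, not part of DFGAnalyzer or DFGGenerator: DfgNode, imperative_node, and to_dot are
hypothetical names, and the choice of a dashed box for the imperative node is arbitrary. Only the
dot node attributes used (label, shape, style) are standard Graphviz.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DfgNode:
    """Hypothetical DFG node; imperative nodes stand for embedded Java fragments."""
    node_id: str
    label: str
    imperative: bool = False
    expandable: bool = True  # whether an editor may open the node's definition


def imperative_node(node_id: str, label: str) -> DfgNode:
    """Create an opaque imperative node that cannot be expanded into a sub-graph."""
    return DfgNode(node_id, label, imperative=True, expandable=False)


def to_dot(nodes, edges, name: str = "hybrid") -> str:
    """Emit a dot digraph, drawing imperative nodes as boxes and opaque ones dashed."""
    lines = [f"digraph {name} {{"]
    for n in nodes:
        shape = "box" if n.imperative else "ellipse"
        style = "" if n.expandable else ", style=dashed"
        lines.append(f'  {n.node_id} [label="{n.label}", shape={shape}{style}];')
    for src, dst in edges:
        lines.append(f"  {src} -> {dst};")
    lines.append("}")
    return "\n".join(lines)


# A Lucid stream X feeding a hypothetical Java method, whose result flows onward.
nodes = [
    DfgNode("x", "X"),
    imperative_node("f", "Java: f()"),
    DfgNode("out", "result"),
]
hybrid_dot = to_dot(nodes, [("x", "f"), ("f", "out")])
print(hybrid_dot)
```

Making the imperative node expandable, as the requirement goes on to ask, would mean replacing the
opaque box with a sub-graph derived from the Java AST, which this sketch does not attempt.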
3.3 Selection of the Language and Tools

One of the goals of this work is to find the optimal technique in terms of soundness, completeness,
formal specification, ease of implementation, and usability; thus we would like to solicit opinions
and insights on this work in selecting the technique or a combination of techniques, which seems
a more plausible outcome.

The current design allows any of the implementations to be chosen, or a combination of them.

Graphviz 

First, the most obvious is Ding's [5] basic DFG implementation within GIPSY, as it is already
part of the project and done for the two predecessor Lucid dialects. Additionally, the modern
version of Graphviz now also has integration with Eclipse [6], so GIPSY's IDE, RIPE (Run-time
Interactive Programming Environment), may very well become an Eclipse-based plug-in.

PureData 

Puckette came up with the PureData [26] language and its commercial offshoots, which also
employ DFG-like programming with boxes and inlets and outlets of any data types graphically
collected, allowing sub-graphs and external implementations of inlets in procedural
languages. Puckette's original design was targeting signal processing for electronic music and
video processing and production for interactive artistic and performative processes, but has since
outgrown that notion. The PureData externals allow deeper media visualizations in OpenGL,
video, etc., thereby potentially enhancing the whole aspect of the process significantly.

BPEL 

The BPEL (Business Process Execution Language) and its visual realization within NetBeans
[27, 23] for SOA (service-oriented architectures) and web services is another good model for
inspiration [13, 10] that has recently undergone a lot of research and development, including
flows, picking structures, faults, and parallel/asynchronous and sequential activities. More
importantly, BPEL notations have a backing formalism based on Petri nets (see e.g. the
visual BPEL graph in BPEL Designer, which first came with the NetBeans IDE, in Figure 6; it
illustrates two flows and 3 parallel activities in each flow as well as asynchrony and timeouts
modeling. This specification actually translates to executable Java web services code).

4 Conclusion 

With the goal of having a visual DFG-based tool to model Forensic Lucid case specifications, we
deliberate on the possible choice of languages and paradigms within today's technologies and
their practicality, and attempt to build upon previous sound work in this area. The main choices
identified so far include a Ding-derived Graphviz-based implementation, a PureData-based one, or
a BPEL-like one. All languages are more or less industry standards and have some formal backing;
the ones that do not may require additional work to formally specify their semantics and prove
correctness and soundness of translation to and from Forensic Lucid.

The main problem with PureData and Graphviz's dot is that their languages do not have
formal semantics specified, only some semantic notes and lexical and grammatical structures
(e.g. see dot's [2]). If we use any or all of these, we will have to provide translation rules
and their semantics and equivalence to the original Forensic Lucid specification, similarly to
what was done e.g. by Jarraya for the UML 2.0/SysML state/activity diagrams and probabilities
when translating to PRISM [11], or equivalently for the Forensic Lucid to PRISM translation to do
model checking.

Thus, this work at this stage is to solicit comments and recommendations on the proposed
choices for the task. Given the authors' some familiarity with all three languages, the final
choice may end up being an intermediate form or a collection of mutually translatable forms.



Acknowledgments 

This work was supported in part by NSERC and the Faculty of Engineering and Computer 
Science, Concordia University, Montreal, Canada. 



References 

[1] Gerard Allwein and Jon Barwise, editors. Logical reasoning with diagrams. Oxford University
Press, Inc., New York, NY, USA, 1996.

[2] AT&T Labs Research and Various Contributors. The DOT language. [online], 1996-2011.
http://www.graphviz.org/pub/scm/graphviz2/doc/info/lang.html

[3] AT&T Labs Research and Various Contributors. Graphviz - graph visualization software.
[online], 1996-2011. http://www.graphviz.org/

[4] R. Bardohl, M. Minas, G. Taentzer, and A. Schürr. Application of graph transformation to visual
languages. In Handbook of Graph Grammars and Computing by Graph Transformation: Applications,
Languages, and Tools, volume 2, pages 105-180. World Scientific Publishing Co., Inc., River Edge,
NJ, USA, 1999.

[5] Yimin Ding. Automated translation between graphical and textual representations of intensional
programs in the GIPSY. Master's thesis, Department of Computer Science and Software Engineering,
Concordia University, Montreal, Canada, June 2004.
http://newton.cs.concordia.ca/~paquet/filetransfer/publications/theses/DingYiminMSc2004.pdf

[6] Eclipse contributors et al. Eclipse Platform. eclipse.org, 2000-2011. http://www.eclipse.org,
last viewed February 2010.

[7] Anthony A. Faustini. The Equivalence of a Denotational and an Operational Semantics of Pure
Dataflow. PhD thesis, University of Warwick, Computer Science Department, Coventry, United
Kingdom, 1982.

[8] Pavel Gladyshev. Finite state machine analysis of a blackmail investigation. International
Journal of Digital Evidence, 4(1), 2005.

[9] Pavel Gladyshev and Ahmed Patel. Finite state machine approach to digital event reconstruction.
Digital Investigation Journal, 2(1), 2004.

[10] IBM, BEA Systems, Microsoft, SAP AG, and Siebel Systems. Business Process Execution Language
for Web Services version 1.1. [online], IBM, February 2007.
http://www.ibm.com/developerworks/library/specification/ws-bpel/

[11] Yosr Jarraya. Verification and Validation of UML and SysML Based Systems Engineering Design
Models. PhD thesis, Department of Electrical and Computer Engineering, Concordia University,
Montreal, Canada, February 2010.

[12] Yi Ji. Scalability evaluation of the GIPSY runtime system. Master's thesis, Department of
Computer Science and Software Engineering, Concordia University, Montreal, Canada, March 2011.

[13] Dieter Koenig. Web services business process execution language (WS-BPEL 2.0): The standards
landscape. Presentation, IBM Software Group, 2007.

[14] N. G. Miller. A Diagrammatic Formal System for Euclidean Geometry. PhD thesis, Cornell
University, U.S.A, 2001.






[15] Serguei A. Mokhov. Towards hybrid intensional programming with JLucid, Objective Lucid, and
General Imperative Compiler Framework in the GIPSY. Master's thesis, Department of Computer
Science and Software Engineering, Concordia University, Montreal, Canada, October 2005. ISBN
0494102934; online at http://arxiv.org/abs/0907.2640

[16] Serguei A. Mokhov. Encoding forensic multimedia evidence from MARF applications as Forensic
Lucid expressions. In Tarek Sobh, Khaled Elleithy, and Ausif Mahmood, editors, Novel Algorithms
and Techniques in Telecommunications and Networking, proceedings of CISSE'08, pages 413-416,
University of Bridgeport, CT, USA, December 2008. Springer. Printed in January 2010.

[17] Serguei A. Mokhov. Towards syntax and semantics of hierarchical contexts in multimedia
processing applications using MARFL. In Proceedings of the 32nd Annual IEEE International Computer
Software and Applications Conference (COMPSAC), pages 1288-1294, Turku, Finland, July 2008.
IEEE Computer Society.

[18] Serguei A. Mokhov, Joey Paquet, and Mourad Debbabi. Formally specifying operational semantics
and language constructs of Forensic Lucid. In Oliver Göbel, Sandra Frings, Detlef Günther, Jens
Nedon, and Dirk Schadt, editors, Proceedings of the IT Incident Management and IT Forensics
(IMF'08), LNI140, pages 197-216. GI, September 2008.

[19] Serguei A. Mokhov, Joey Paquet, and Mourad Debbabi. Towards automated deduction in blackmail
case analysis with Forensic Lucid. In Joseph S. Gauthier, editor, Proceedings of the Huntsville
Simulation Conference (HSC'09), pages 326-333. SCS, October 2009. Online at
http://arxiv.org/abs/0906.0049

[20] Serguei A. Mokhov, Joey Paquet, and Mourad Debbabi. Towards automatic deduction and event
reconstruction using Forensic Lucid and probabilities to encode the IDS evidence. In S. Jha,
R. Sommer, and C. Kreibich, editors, Proceedings of RAID'10, LNCS 6307, pages 508-509. Springer,
September 2010.

[21] Serguei A. Mokhov and Emil Vassev. Self-forensics through case studies of small to medium
software systems. In Proceedings of IMF'09, pages 128-141. IEEE Computer Society, September 2009.

[22] Serguei A. Mokhov, Emil Vassev, Joey Paquet, and Mourad Debbabi. Towards a self-forensics
property in the ASSL toolset. In Proceedings of C3S2E'10, pages 108-113. ACM, May 2010.

[23] NetBeans Community. NetBeans Integrated Development Environment. [online], 2004-2011.
http://www.netbeans.org

[24] OpenESB Contributors. BPEL service engine. [online], 2009.
https://open-esb.dev.java.net/BPELSE.html

[25] Joey Paquet. Scientific Intensional Programming. PhD thesis, Department of Computer Science,
Laval University, Sainte-Foy, Canada, 1999.

[26] Miller Puckette and PD Community. Pure Data. [online], 2007-2011. http://puredata.org

[27] Sun Microsystems, Inc. NetBeans 6.7.1. [online], 2009-2010.
http://netbeans.org/downloads/6.7.1/index.html

[28] Phan C. Vinh and Jonathan P. Bowen. On the visual representation of configuration in
reconfigurable computing. Electron. Notes Theor. Comput. Sci., 109:3-15, 2004.

[29] Ai Hua Wu. OO-IP Hybrid Language Design and a Framework Approach to the GIPC. PhD
thesis, Department of Computer Science and Software Engineering, Concordia University, Montreal,
Canada, 2009.

[30] Chunfang Zheng and J. Robert Heath. Simulation and visualization of resource allocation,
control, and load balancing procedures for a multiprocessor architecture. In MS'06: Proceedings of
the 17th IASTED international conference on Modelling and simulation, pages 382-387, Anaheim, CA,
USA, 2006. ACTA Press.





